INTRODUCTION

The paper reports on research comparing various approaches, or methodologies, for software
development. The study focuses on the quantitative analysis of the application of certain
methodologies in an experimental environment, in order to further understand their effects and
better demonstrate their advantages in a controlled environment. A series of statistical
experiments was conducted, comparing programming teams that used a disciplined methodology
(consisting of top-down design, process design language usage, structured programming, chief
programmer teams, and code reading) with programming teams and individual programmers that
employed their own ad hoc approach. Specific details of the experimental setting, the
investigative approach (used to plan, execute, and analyze the experiments), and some of the
results of the experiments are discussed.

The purpose of the research was to develop an investigative methodology for experimentally
studying and quantitatively characterizing the effect of methodologies and programming
environments on software development. It involves the quantitative measurement and analysis of
both the process and the product of software development, in a manner which is minimally
obtrusive (to those developing the software), very objective, and highly automatable. The basic
premise is that distinctions among the groups exist both in the process and in the product.

SPECIFICS 


Nineteen units (teams or individuals) each performed the same software development task, but
under controlled and slightly varied conditions. Two programming factors, size of programming
team and degree of methodological discipline, each with two levels (single individual, and
three-person team; the ad hoc approach, and the disciplined methodology), were chosen as the
independent variables and formed the experimental treatments. The dependent variables to be
observed and measured were a large set (over 125) of programming aspects. The teams and
individuals were placed into three treatment groups, designated A, B and C (of 6, 6 and 7 units,
respectively), each operating under a certain combination of factor-levels:

A — individuals, ad hoc approach; 

B — three-person teams, ad hoc approach; 

C — three-person teams, disciplined methodology. 
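
The design described above can be written down as a small data structure. The following Python sketch is illustrative only: the class and function names are hypothetical, and the participant count assumes that every team had exactly three members, as the factor levels suggest.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Treatment:
    label: str
    team_size: int     # 1 = single individual, 3 = three-person team
    disciplined: bool  # False = ad hoc approach, True = disciplined methodology
    units: int         # number of teams or individuals in the group


# Group sizes as stated in the text: 6, 6 and 7 units.
GROUPS = [
    Treatment("A", team_size=1, disciplined=False, units=6),
    Treatment("B", team_size=3, disciplined=False, units=6),
    Treatment("C", team_size=3, disciplined=True, units=7),
]


def total_units(groups):
    """Number of experimental units (teams or individuals)."""
    return sum(g.units for g in groups)


def total_participants(groups):
    """Number of people, assuming each unit has exactly team_size members."""
    return sum(g.units * g.team_size for g in groups)


def empty_cells(groups):
    """Factor-level combinations (team_size, disciplined) that no group occupies."""
    used = {(g.team_size, g.disciplined) for g in groups}
    return [(size, disc) for size in (1, 3) for disc in (False, True)
            if (size, disc) not in used]


print(total_units(GROUPS))         # 19
print(total_participants(GROUPS))  # 45
print(empty_cells(GROUPS))         # [(1, True)]
```

The last line makes explicit that the two factors are not fully crossed: no group consists of individuals using the disciplined methodology.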

The experiment took place in Spring 1976, in conjunction with two academic courses at the
University of Maryland. The particular project or application to be developed was a compiler
for a small high-level language and a simple stack machine. This task was roughly a two
man-month effort, and the resulting software systems averaged about 1200 source lines or 600
executable statements, in high-level structured-language code. The participants were advanced
undergraduates and graduate students in the Computer Science Department. The implementation
language was the high-level structured-programming language SIMPL-T [Basili and Turner 76],
which is used extensively in course work at the University and has string-processing
capabilities similar to PL/1.

Data collection for the experiment was automated on-line, with essentially no interference to the 
programmer’s normal pattern of actions during computer sessions. Special module compilation and 
program execution processors created an historical data base of source code and test data accumu- 
lated throughout the project development. Scores corresponding to each of the programming 
aspects were extracted directly and algorithmically from this data base. 

The programming aspects represent specific automatically isolatable and observable features of the 
programming phenomenon, related to either the product or the process of software development. 
Product aspects are based on the syntactic content and organization of the symbolic source code 
which represents the complete final product developed. Process aspects are related to characteris- 
tics of the development process itself, in particular, the cost and required effort as reflected in the 
number of computer job steps (or runs) and the amount of textual revision of source code during 
development. Major headings for the particular programming aspects reported on in this study are 
listed in the accompanying table, with qualifying subcategories mentioned in square brackets. 
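
As an illustration of how a product aspect might be scored algorithmically, the following Python sketch computes statement type counts and percentages. It is not the study's actual extraction software: the input format (one statement-type label per statement) and the function names are assumptions made for the example, and no SIMPL-T parsing is attempted.

```python
from collections import Counter

# Statement types listed in the accompanying table; "CALL" stands for (proc)CALL.
STATEMENT_TYPES = (":=", "IF", "CASE", "WHILE", "EXIT", "CALL", "RETURN")


def statement_type_counts(statements):
    """Count how often each known statement type occurs; unknown labels are ignored."""
    counts = Counter(statements)
    return {t: counts.get(t, 0) for t in STATEMENT_TYPES}


def statement_type_percentages(statements):
    """Express each count as a percentage of all counted statements."""
    counts = statement_type_counts(statements)
    total = sum(counts.values())
    if total == 0:
        return {t: 0.0 for t in STATEMENT_TYPES}
    return {t: 100.0 * n / total for t, n in counts.items()}


# Hypothetical toy program, already reduced to one label per statement.
toy = [":=", ":=", "IF", "CALL", "WHILE", ":=", "RETURN", "IF"]

print(statement_type_counts(toy)["IF"])       # 2
print(statement_type_percentages(toy)[":="])  # 37.5
```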

APPROACH 

The investigative methodology was designed and developed as a scientific and empirical solution
to the problem of comparing software development efforts under various conditions. It was used
to guide the planning, execution, and analysis of the set of experiments which comprise this
study. The approach consists of eleven steps or elements, as shown in the accompanying schematic
diagram which charts the general flow (solid lines) and some of the interrelationships (dashed
lines) among these elements.

The methodology begins with Questions of Interest, which are turned into Research Hypotheses and
Statistical Hypotheses. The Statistical Model is very important since it governs the Experimental
Design and several other elements. Statistical Results, corresponding directly to the Statistical
Hypotheses, are determined by the Collected Data via the Statistical Test Procedures. Research
Framework(s) are necessary to organize the large volume of hypotheses and results into a smaller,
more manageable form as Statistical Conclusions and Research Interpretations.

RESULTS 

The methodology provides that the study’s results be separated into statistical conclusions, represent- 
ing factual findings, and research interpretations, representing intuitive judgements. 

For each aspect there is one statistical conclusion which states any differences observed among the 
three programming environments represented by the groups A, B, and C. These outcomes are ex- 
pressed in the form of “equations”; e.g., A<B=C means that the average score for the individual 
programmers was appreciably lower than the average scores for the ad hoc teams and the discip- 
lined teams which both had about the same average score. In addition to the null outcome (A=B=C) 
of no observed differences, there were twelve other possible outcomes, as noted in the accom- 
panying table. The table simply lists all the non-null conclusions, arranged by outcome. The values 
in the “error” column state the risk, as a probability value, of erroneously making that conclusion 
and indicate how strongly pronounced the differences were in the data. Although there is much 
fascinating material in these findings, space permits only a few particularly interesting conclusions 
to be pointed out. 
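
The count of thirteen possible outcomes (one null plus twelve others) can be checked by enumerating every ranking of the three groups in which ties are allowed. The Python sketch below does this; the function names are hypothetical, and tied groups are printed in alphabetical order, so the outcome written C=A<B in the text appears here as A = C < B.

```python
from itertools import permutations


def weak_orderings(groups):
    """Return every ranking of the groups from lowest to highest, ties allowed.

    Each ranking is a tuple of tiers; each tier is a tuple of tied groups.
    """
    results = set()
    n = len(groups)
    for perm in permutations(groups):
        # Between each adjacent pair choose "=" (bit set) or "<" (bit clear).
        for mask in range(2 ** (n - 1)):
            tiers = [[perm[0]]]
            for i in range(1, n):
                if mask & (1 << (i - 1)):
                    tiers[-1].append(perm[i])  # tie with the previous group
                else:
                    tiers.append([perm[i]])    # strictly higher tier
            # Sorting inside a tier removes duplicates such as B=C versus C=B.
            results.add(tuple(tuple(sorted(t)) for t in tiers))
    return sorted(results)


def show(ordering):
    """Format a ranking in the notation of the text, e.g. 'A < B = C'."""
    return " < ".join(" = ".join(tier) for tier in ordering)


outcomes = weak_orderings("ABC")
non_null = [o for o in outcomes if len(o) > 1]

print(len(outcomes))                          # 13
print(len(non_null))                          # 12
print(sum(1 for o in outcomes if len(o) == 2))  # 6 outcomes with one tie
print(sum(1 for o in outcomes if len(o) == 3))  # 6 strict orderings
```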

The A<B=C outcome was quite pronounced for the SEGMENTS aspect, indicating that the individuals
built their systems with fewer routines on the average than either the ad hoc teams or the
disciplined teams, which used about the same number of routines. According to the A<B=C and
B=C<A outcomes, the individuals had noticeably fewer global variables and more local variables
than both types of teams. The C=A<B outcomes for IF statements and DECISIONS indicate aspects
where the disciplined teams behaved like the individuals and both were different from the ad hoc
teams. For the number of COMPUTER RUNS (JOB STEPS), and several subcategories, the C<A=B
outcomes have very low error risks and indicate that the disciplined teams out-performed both
the individuals and the ad hoc teams in these aspects. PROGRAM CHANGES is a measure of the
amount of cumulative textual revision of the program source code during development, which has
been shown to correlate well with total error occurrences [Dunsmore and Gannon 77]. On this
aspect, the same data scores which support the C<A<B conclusion at a high risk of error (0.185)
also support the C<A=B conclusion at a very low risk of error (0.004), indicating a strong
distinction in terms of error-proneness in favor of the disciplined teams.

One framework for the interpretation of these conclusions is the concept of how the disciplined
methodology actually impacts the software development process and product. Prior to conducting
the experiment, certain general beliefs (see details on accompanying slide) about the impact had
been formulated. Certain basic suppositions (a priori expectations), for how the experiments
should turn out if the beliefs were true, were constructed from the general beliefs. Examination
of how the conclusions stack up against the suppositions (how true the beliefs are) shows that
none of the conclusions for any of the observed programming aspects contravene the basic
suppositions. Thus, the study’s results may be interpreted as strong experimental evidence in
favor of these general beliefs.
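
The comparison of conclusions against suppositions can be sketched mechanically. The Python example below rests on one interpretive assumption that the text does not state: an outcome is taken to contravene a supposition only when it shows a difference in the opposite direction, so ties count as compatible. Under that reading, the process supposition requires that C is never ranked above A or B, and the product supposition requires that C is never ranked strictly outside the range spanned by A and B. The tallies are those given with the research interpretations; the function names are hypothetical.

```python
def ranks(outcome):
    """Map each group to its tier index (0 = lowest) in an outcome such as 'C=A<B'."""
    result = {}
    for tier_index, tier in enumerate(outcome.replace(" ", "").split("<")):
        for group in tier.split("="):
            result[group] = tier_index
    return result


def conforms(kind, outcome):
    """True if the outcome does not contravene the supposition (assumed reading)."""
    r = ranks(outcome)
    a, b, c = r["A"], r["B"], r["C"]
    if kind == "process":
        return c <= a and c <= b           # C never above A or B
    return min(a, b) <= c <= max(a, b)     # C never outside the A..B range


# (kind, outcome, number of aspects), as tallied with the research interpretations.
TALLY = [
    ("process", "C<A=B", 8), ("process", "C<A<B", 1), ("process", "A=B=C", 1),
    ("product", "A<B=C", 9), ("product", "B=C<A", 5), ("product", "B<C=A", 3),
    ("product", "C=A<B", 11), ("product", "A<C<B", 1), ("product", "B<C<A", 2),
    ("product", "A=B=C", 96),
]

violations = [(k, o, n) for k, o, n in TALLY if not conforms(k, o)]
print(violations)                   # []
print(sum(n for _, _, n in TALLY))  # 137 aspects in total
print(conforms("product", "C<A=B")) # False: this outcome would contravene
```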

SUMMARY 

A practical methodology was designed and developed for experimentally and quantitatively
investigating the software development phenomenon. It was employed to compare three particular
software development environments and to evaluate the relative impact of a particular
disciplined methodology (made up of so-called modern programming practices). The experiments
were successful in measuring differences among programming environments, and the results support
the claim that disciplined methodology effectively improves both the process and product of
software development. The results will be used to guide further experiments and will act as a
basis for analysis of software development products and processes in the Software Engineering
Laboratory at NASA/GSFC [Basili et al. 77]. The intention is to pursue this type of research,
especially extending the study to include more sophisticated and promising programming aspects,
such as Halstead’s software science quantities [Halstead 77] and other software complexity
metrics [McCabe 76].

Programming Aspects


Development Process Aspects : 

COMPUTER RUNS (JOB STEPS) 

[compilations, executions, miscellaneous] 

ESSENTIAL RUNS (JOB STEPS) 

AVERAGE UNIQUE COMPILATIONS PER MODULE 
MAX UNIQUE COMPILATIONS FOR ANY ONE MODULE 
PROGRAM CHANGES 

Final Product Aspects : 

MODULES 

SEGMENTS 

SEGMENT TYPE COUNTS 
[function, procedure] 

SEGMENT TYPE PERCENTAGES 
[function, procedure] 

AVERAGE SEGMENTS PER MODULE 

LINES 

STATEMENTS 

STATEMENT TYPE COUNTS
[:=, IF, CASE, WHILE, EXIT, (proc)CALL, RETURN]

STATEMENT TYPE PERCENTAGES
[:=, IF, CASE, WHILE, EXIT, (proc)CALL, RETURN]

AVERAGE STATEMENTS PER SEGMENT
AVERAGE STATEMENT NESTING LEVEL
DECISIONS

FUNCTION CALLS
[non-intrinsic, intrinsic]


TOKENS 

AVERAGE TOKENS PER STATEMENT 
INVOCATIONS 

[function, procedure; non-intrinsic, intrinsic] 

AVG INVOCATIONS PER (CALLING) SEGMENT 
[function, procedure; non-intrinsic, intrinsic] 

AVG INVOCATIONS PER (CALLED) SEGMENT

[function, procedure] 

DATA VARIABLES 
DATA VARIABLE SCOPE COUNTS 
[global, parameter, local] 

DATA VARIABLE SCOPE PERCENTAGES 
[global, parameter, local] 

AVERAGE GLOBAL VARIABLES PER MODULE 
[modified, not modified; non-entry, entry] 

AVERAGE NON-GLOBAL VARIABLES PER SEGMENT 
[parameter, local] 

PARAMETER PASSAGE TYPE PERCENTAGES 
[value, reference] 

(SEG, GLOBAL) ACTUAL USAGE PAIRS 
[modified, not modified; non-entry, entry] 

(SEG, GLOBAL) POSSIBLE USAGE PAIRS 
[modified, not modified; non-entry, entry] 

(SEG, GLOBAL) USAGE RELATIVE PERCENTAGES 
[modified, not modified; non-entry, entry] 

(SEG, GLOBAL, SEG) DATA BINDINGS 
[actual; possible; relative percentage] 

Diagram 1. Approach Schematic



Non-Null Conclusions, arranged by outcome

outcome      error    freq   programming aspect

A < B = C             9
             0.0634          SEGMENTS
             0.0698          DATA VARIABLES
             0.1476          DATA VARIABLE SCOPE COUNTS \ GLOBAL
             0.1614          DATA VARIABLE SCOPE COUNTS \ GLOBAL \ MODIFIED
             0.2015          DATA VARIABLE SCOPE COUNTS \ NON-GLOBAL
             0.1271          DATA VARIABLE SCOPE COUNTS \ NON-GLOBAL \ PARAMETER
             0.1507          DATA VARIABLE SCOPE PERCENTAGES \ NON-GLOBAL \ PARAMETER
             0.1748          AVERAGE NON-GLOBAL VARIABLES PER SEGMENT \ PARAMETER
             0.1227          (SEG, GLOBAL) POSSIBLE USAGE PAIRS

B = C < A             5
             0.1706          AVERAGE STATEMENTS PER SEGMENT
             0.1699          AVG INVOCATIONS PER (CALLING) SEGMENT \ NON-INTRINSIC
             0.1699          AVG INVOCATIONS PER (CALLED) SEGMENT
             0.1936          AVG INVOCATIONS PER (CALLED) SEGMENT \ FUNCTION
             0.1090          DATA VARIABLE SCOPE PERCENTAGES \ NON-GLOBAL \ LOCAL

B < C = A             3
             0.2195          STATEMENT TYPE PERCENTAGES \ CASE
             0.2364          (SEG, GLOBAL) USAGE RELATIVE PERCENTAGES
             0.1546          (SEG, GLOBAL) USAGE RELATIVE PERCENTAGES \ NOT MODIFIED \ NON-ENTRY

C = A < B             11
             0.2134          SEGMENT TYPE COUNTS \ FUNCTION
             0.2321          STATEMENTS
             0.0780          STATEMENT TYPE COUNTS \ IF
             0.1732          STATEMENT TYPE COUNTS \ (PROC)CALL \ INTRINSIC
             0.0196          STATEMENT TYPE COUNTS \ RETURN
             0.1038          STATEMENT TYPE PERCENTAGES \ IF
             0.2065          STATEMENT TYPE PERCENTAGES \ RETURN
             0.1468          DECISIONS
             0.1732          INVOCATIONS \ PROCEDURE \ INTRINSIC
             0.0435          INVOCATIONS \ INTRINSIC
             0.1861          (SEG, GLOBAL, SEG) DATA BINDINGS \ POSSIBLE

C < A = B             8
             0.0036          COMPUTER RUNS (JOB STEPS)
             0.0223          COMPUTER RUNS (JOB STEPS) \ MODULE COMPILATIONS
             0.0110          COMPUTER RUNS (JOB STEPS) \ MODULE COMPILATIONS \ UNIQUE
             0.0221          COMPUTER RUNS (JOB STEPS) \ PROGRAM EXECUTIONS
             0.1445          COMPUTER RUNS (JOB STEPS) \ MISCELLANEOUS
             0.0037          ESSENTIAL RUNS (JOB STEPS)
             0.0883          AVERAGE UNIQUE COMPILATIONS PER MODULE
             0.1180          MAX UNIQUE COMPILATIONS FOR ANY ONE MODULE

A = B < C             0

A < B < C             0

A < C < B             1
             0.1194          LINES

B < C < A             2
             0.1232          (SEG, GLOBAL) USAGE RELATIVE PERCENTAGES \ MODIFIED \ ENTRY
             0.1173          (SEG, GLOBAL) USAGE RELATIVE PERCENTAGES \ ENTRY

B < A < C             0

C < A < B             1
             0.1848          PROGRAM CHANGES

C < B < A             0

Research Interpretations


General Beliefs: 

— The disciplined methodology reduces the average cost and complexity of the process. 

— The disciplined methodology can enable a programming team to compensate for their in- 
herent coordination overhead and behave more like an individual programmer in terms of 
designing and building the product. 

Basic Suppositions: 

— on process aspects: C < A, B

— on product aspects: A < C < B or B < C < A

Support from the conclusions:

— process: C < A = B on 8 aspects
           C < A < B on 1 aspect
           A = B = C on 1 aspect

— product: A < B = C on 9 aspects
           B = C < A on 5 aspects
           B < C = A on 3 aspects
           C = A < B on 11 aspects
           A < C < B on 1 aspect
           B < C < A on 2 aspects
           A = B = C on 96 aspects

None of the conclusions for any of the observed programming aspects contravene these basic 
suppositions. 

Thus, the study’s results may be interpreted as strong experimental evidence in favor of these 
general beliefs. 




Robert W. Reiter, Jr. 
University of Maryland 


